Systems and methods for epigenetic sequencing

ABSTRACT

The present invention generally relates to microfluidics and/or epigenetic sequencing. In one set of embodiments, cells contained within a plurality of microfluidic droplets are lysed and the DNA (e.g., from nucleosomes) within the droplets is labeled, e.g., with adapters containing an identification sequence. The adapters may also contain other sequences, e.g., restriction sites, primer sites, etc., to assist with later analysis. After labeling with adapters, the DNA from the different cells may be combined and analyzed, e.g., to determine epigenetic information about the cells. For example, the DNA may be separated on the basis of certain modifications (e.g., methylation) using techniques such as chromatin immunoprecipitation (“ChIP”), and the DNA from the separated nucleosomes may be sequenced. In some cases, the DNA sequences may also be aligned with genomes, e.g., to determine which portions of the genome were epigenetically modified, e.g., via methylation.

RELATED APPLICATIONS

This application is a continuation-in-part of Int. Patent Application No. PCT/US2013/029123, filed Mar. 5, 2013, entitled “Systems and Methods for Epigenetic Sequencing,” which claims the benefit of U.S. Provisional Patent Application Ser. No. 61/634,744, filed Mar. 5, 2012, entitled “Systems and Methods for Epigenetic Sequencing,” by Rotem, et al., each incorporated herein by reference.

FIELD OF INVENTION

The present invention generally relates to microfluidics and/or epigenetic sequencing.

BACKGROUND

Epigenetics is the study of the transmission of genetic information by mechanisms other than the DNA sequence of nucleotides. For example, epigenetic information may be transmitted via methylation of nucleotides within DNA (e.g., cytosine to 5-methylcytosine), or by histone modifications such as histone acetylation or deacetylation, methylation, ubiquitylation, phosphorylation, sumoylation, etc. Such epigenetic modifications may affect the structure of chromatin, which is a higher-order structure of protein, DNA and RNA within cells. Chromatin structure is known to play an important role in regulating genome function and in particular, its varied structure across cell types helps ensure that the correct genes are expressed in the correct cell types.

Most techniques for studying epigenetics require large populations of cells, e.g., thousands of cells. For example, histone modifications can be mapped by immunoprecipitating chromatin with antibodies to a modified histone and then sequencing the DNA (ChIP-Seq). However, this method typically requires ~100,000 cells or more. Furthermore, the analysis is carried out on the entire population and is blind to differences among cells.

In contrast, systems and methods for studying the epigenomes in single cells or small numbers of cells are becoming increasingly important for understanding the principles of chromatin and genome regulation. Moreover, such approaches could have many clinical applications in cancer biology, immunology, neuroscience or other fields in which subject tissues are complex, heterogeneous and/or limited in size. For example, tumors represent heterogeneous mixtures of cells that may be driven by sub-populations of cancer stem cells. Single cell epigenomic profiling methods could improve understanding of critical epigenomic changes in cancer stem cells. They might also enable early detection or surveillance of disease.

SUMMARY

The present invention generally relates to microfluidics and/or epigenetic sequencing. The subject matter of the present invention involves, in some cases, interrelated products, alternative solutions to a particular problem, and/or a plurality of different uses of one or more systems and/or articles.

In one set of embodiments, the present invention is generally directed to a method comprising acts of providing a plurality of cells contained within a plurality of microfluidic droplets, lysing the cells contained within the microfluidic droplets to produce cell lysates therein, exposing at least some of the cell lysates contained within the microfluidic droplets to a non-nucleosome-cleaving nuclease to produce a plurality of nucleosome sequences within the microfluidic droplets, ligating adapters to at least some of the nucleosome sequences, the adapters comprising an identification sequence and a restriction site, and sequencing the nucleosome sequences containing ligated adapters. In another set of embodiments, the present invention is generally directed to a method of providing a solution comprising a plurality of nucleosome sequences originating from a plurality of cells, at least some of the nucleosome sequences being ligated to an adapter, the adapter comprising an identification sequence and a restriction site, wherein nucleosome sequences originating from the same cell contain identical identification sequences, and nucleosome sequences originating from different cells contain different identification sequences, and sequencing at least some of the nucleosome sequences.

The present invention, in yet another set of embodiments, is generally directed to a composition comprising a plurality of droplets, at least some of which each contain a nucleosome sequence ligated to an adapter, the adapter comprising an identification sequence and a restriction site, wherein nucleosome sequences originating from the same cell contain identical identification sequences, and nucleosome sequences originating from different cells contain different identification sequences.

In still another set of embodiments, the present invention is generally directed to a composition comprising a nucleic acid sequence comprising an identification sequence, a restriction site, an inversion of the restriction site, and an inversion of the identification sequence.

In another set of embodiments, the present invention is generally directed to a composition comprising a plurality of palindromic or substantially palindromic nucleic acid sequences comprising a plurality of different identification sequences each having the same length, and a substantially identical restriction site.

In another aspect, the present invention encompasses methods of making one or more of the embodiments described herein. In still another aspect, the present invention encompasses methods of using one or more of the embodiments described herein.

Other advantages and novel features of the present invention will become apparent from the following detailed description of various non-limiting embodiments of the invention when considered in conjunction with the accompanying figures. In cases where the present specification and a document incorporated by reference include conflicting and/or inconsistent disclosure, the present specification shall control. If two or more documents incorporated by reference include conflicting and/or inconsistent disclosure with respect to each other, then the document having the later effective date shall control.

BRIEF DESCRIPTION OF THE DRAWINGS

Non-limiting embodiments of the present invention will be described by way of example with reference to the accompanying figures, which are schematic and are not intended to be drawn to scale. In the figures, each identical or nearly identical component illustrated is typically represented by a single numeral. For purposes of clarity, not every component is labeled in every figure, nor is every component of each embodiment of the invention shown where illustration is not necessary to allow those of ordinary skill in the art to understand the invention. In the figures:

FIGS. 1A-1C illustrate a method of epigenetic sequencing in accordance with certain embodiments of the invention;

FIGS. 2A-2B illustrate cells that are lysed and “tagged” with adapters,in another embodiment of the invention;

FIG. 3 illustrates the structure of an adapter, corresponding to SEQ ID NOs:5-8 top to bottom, in accordance with another embodiment of the invention;

FIGS. 4A-4D illustrate cell preparation in another embodiment of the invention;

FIGS. 5A-5D illustrate unique barcodes within a population of cells, in yet another embodiment of the invention;

FIG. 6 illustrates a ChIP sequencing technique, in accordance with still another embodiment of the invention;

FIGS. 7A-7B illustrate adapters according to another embodiment of the invention; and

FIGS. 8A-8C illustrate the study of different populations of cells, in yet another embodiment of the invention.

DETAILED DESCRIPTION

The present invention generally relates to microfluidics and/or epigenetic sequencing. In one set of embodiments, cells contained within a plurality of microfluidic droplets are lysed and the DNA (e.g., from nucleosomes) within the droplets is labeled, e.g., with adapters containing an identification sequence. The adapters may also contain other sequences, e.g., restriction sites, primer sites, etc., to assist with later analysis. After labeling with adapters, the DNA from the different cells may be combined and analyzed, e.g., to determine epigenetic information about the cells. For example, the DNA may be separated on the basis of certain modifications (e.g., methylation) using techniques such as chromatin immunoprecipitation (“ChIP”), and the DNA from the separated nucleosomes may be sequenced and aligned with genomes, e.g., to determine which portions of the genome were epigenetically modified, e.g., via methylation.

Thus, various aspects of the present invention are generally directed to determining epigenetic information through DNA sequencing. For instance, in one set of embodiments, cells contained within microfluidic droplets are lysed and their DNA is exposed to certain enzymes, such as non-nucleosome-cleaving nucleases or other enzymes that are able to cleave the DNA for later analysis without destroying certain types of epigenetic information, e.g., the interaction of the DNA with the histones, the methylation patterns within DNA, etc. The DNA (still contained within the microfluidic droplets) is then barcoded or tagged using a ligation step with specific “adapters” that can be used to later identify the source, down to a single cell level in some cases, of the DNA.

After ligation of the adapters to the DNA, the DNA may then be sequenced. In certain embodiments, droplets containing the lysate of different cells may be combined together prior to analysis. Due to the presence of the ligated adapters, the DNA originating from the same cell may contain identical identification sequences, while DNA originating from different cells may contain different identification sequences, thereby allowing the DNA of each cell within a sample to be separately identified and determined. In some cases, for instance, the identification sequences may be selected such that a certain number of cells taken from a plurality of cells can be readily identified. For example, an identification sequence of n nucleotides may allow for up to 4^(n) distinct cells to be studied. Thus, even relatively small identification sequences (e.g., 4, 5, 6, 7, 8, 9, etc. nucleotides long) may allow for relatively large populations of cells to be separately determined, e.g., at a single-cell level.
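As a non-limiting illustration of the 4^(n) relationship, the following Python sketch computes how many distinct identification sequences a given length allows, and the shortest length sufficient for a given number of cells. The function names are hypothetical and are introduced here only for explanation; the sketch assumes the four standard nucleotides and one distinct identification sequence per cell.

```python
def max_distinct_cells(n: int) -> int:
    """Upper bound on distinguishable cells for an identification sequence of n nucleotides."""
    if n < 1:
        raise ValueError("identification sequence must have at least one nucleotide")
    return 4 ** n  # four possible nucleotides at each of n positions


def min_identification_length(num_cells: int) -> int:
    """Smallest n (at least 1) such that 4**n is at least num_cells."""
    if num_cells < 1:
        raise ValueError("num_cells must be positive")
    n = 1
    while 4 ** n < num_cells:
        n += 1
    return n


if __name__ == "__main__":
    for cells in (100, 1_000_000):
        n = min_identification_length(cells)
        print(cells, "cells need at least", n, "nucleotides;", max_distinct_cells(n), "sequences available")
```

In practice a longer sequence than this minimum may be preferred, e.g., so that only a subset of the possible sequences needs to be used.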

One non-limiting example is illustrated with reference to FIG. 6. In this figure, a cell (containing chromatin) is contained within a microfluidic droplet. The cell is lysed and exposed to an enzyme such as MNase, which cleaves the DNA released from the lysed cell into smaller fragments without substantially affecting those portions of the DNA that interact with the histones within the nucleosome structures. Barcodes or other adapters, which may be formed from certain types of sequences as is discussed herein, are added to the droplet and ligated to the DNA. Next, ChIP analysis is performed and the DNA sequenced.

Another example of an embodiment of the invention is now described with respect to FIG. 1. In FIG. 1A, part 1, and in FIG. 2A, a plurality of cells are contained within droplets using a microfluidic droplet maker, e.g., within an aqueous environment contained within an oil. Typically, the cells are encapsulated within droplets at a density such that on average, each droplet contains one cell (or fewer). Within a droplet, a cell may be lysed, then exposed to an enzyme such as MNase, which cleaves the DNA released from the lysed cell into smaller fragments without substantially affecting those portions of the DNA that interact with the histones within the nucleosome structures. Droplets containing an adapter (as discussed below), a ligase, and/or other components such as buffers or the like are also created separately, as shown in FIG. 1A, part 2. Similar to FIG. 1A, part 1, the adapters may be contained in a droplet within an aqueous environment, contained within an oil.

A schematic diagram of an example of an adapter is shown in FIG. 1B, part 1, and FIG. 3. The adapter in this example comprises a sequence recognizable by a primer (e.g., a PCR primer), a “barcode” or other identification sequence, and a restriction site that can be cleaved by a suitable restriction endonuclease. The identification sequence typically has 4-15 nucleotides, and for a population of adapters to be used to identify a population of cells, the identification sequence of the adapter may differ while the rest of the adapter is substantially constant. Accordingly, because each cell is exposed to an adapter containing a different identification sequence, the nucleic acids arising from these cells may be subsequently distinguished. In this example, the adapter is palindromic or at least substantially palindromic, so that the adapter further contains inverses of these, e.g., as part of a double-stranded structure.

For example, FIG. 3 shows the structure of a typical adapter attached to a length of DNA 5, including an identification region 10, a restriction site (including a recognition sequence 22 and a cleavage sequence 24), and a primer sequence 30. Note that DNA 5 in this example actually has two adapters, one on either side, having the same structure. Due to the generally palindromic nature of these adapters, the adapters cannot be added to the DNA incorrectly, as either orientation would be correct. In addition, in some embodiments, additional adapters could potentially be ligated onto the ends of the adapters. However, due to subsequent cleavage by a suitable restriction endonuclease at the cleavage site, any unwanted or extra adapters can be readily separated from the DNA itself, leaving just the identification sequence remaining on one or both ends, for subsequent determination or analysis.

The droplets containing the cells may be merged with the droplets containing the adapters, as is shown in FIG. 1B, part 2, as well as FIG. 2B (where the tag library is formed from the adapters). Various techniques may be used to fuse the droplets together, and typically, the droplets are fused in a 1:1 ratio such that a single cell (contained within a single droplet) is exposed to a unique adapter (i.e., containing a unique identification sequence). Within the droplets, the adapters are ligated to the DNA released from the lysed cells. As each droplet typically contains a unique adapter, the DNA in each droplet is uniquely identified by a unique identification sequence. A plurality of DNA molecules is typically found within each droplet, some or all of which are thus labeled by the same adapter.

Next, the DNA within the droplets may be sequenced or analyzed, as is shown in FIG. 1C, part 1. In some cases, DNA from different droplets may be combined together prior to analysis. As noted above, the presence of unique identification sequences on the adapters on the DNA may allow the DNA from each of the droplets to be analyzed and distinguished. Accordingly, for example, the DNA (and the epigenetic profile) from a first cell can be distinguished from the DNA from a second cell. Examples of unique identification sequences include those discussed in U.S. Pat. Apl. Ser. No. 61/703,848, incorporated herein by reference in its entirety.
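By way of a hedged illustration only, distinguishing pooled DNA by identification sequence can be sketched as a grouping step over sequencing reads. The sketch below assumes, purely for simplicity, that each read begins with its identification sequence followed directly by the genomic fragment; the read layout, the function name, and the toy reads are hypothetical and are not taken from any particular sequencing platform.

```python
from collections import defaultdict


def group_reads_by_identification(reads, id_length):
    """Group reads by their leading identification sequence.

    Assumes (hypothetically) that the first id_length nucleotides of each
    read are the identification sequence and the remainder is the fragment.
    """
    groups = defaultdict(list)
    for read in reads:
        identification = read[:id_length]
        fragment = read[id_length:]
        groups[identification].append(fragment)
    return dict(groups)


if __name__ == "__main__":
    # Toy pooled reads from two droplets (made-up sequences).
    pooled = ["ACGTTTGACA", "ACGTGGCATC", "TTAGCCATGA"]
    for identification, fragments in group_reads_by_identification(pooled, 4).items():
        print(identification, fragments)
```

Each resulting group would then correspond to the DNA of one droplet, and thus typically one cell, when droplets and adapters are paired 1:1 as described above.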

In one set of embodiments, ChIP (“chromatin immunoprecipitation”) may be used to analyze the DNA. For example, the DNA may be amplified (e.g., using PCR via the primer sequences), cleaved (e.g., using a restriction endonuclease that cleaves the adapter site, for instance, BciVI as is shown in this example), and/or ligated (e.g., using Illumina) such that the DNA is sequenced. A non-limiting example of such analysis is shown schematically in FIG. 1C, part 2. In this example, a plurality of cells may each be uniquely identified (for instance, upon selection of a suitable number of nucleotides within the identification sequence), e.g., even for populations of 100 cells, 1 million cells, etc. However, it should be understood that other techniques could also be used for sequencing.

The above discussion is a non-limiting example of an embodiment of the present invention that can be used to determine an epigenetic profile of a cell. However, other embodiments are also possible. Accordingly, more generally, various aspects of the invention are directed to various systems and methods for sequencing the DNA of cells (typically contained within droplets) to determine epigenetic information.

In one aspect, microfluidic droplets are used, for example, to contain cells. Microfluidic droplets may be used to keep the cells of a plurality of cells separate and identifiable, e.g., such that epigenetic or genetic differences between the different cells may be identified. In contrast, in many prior art techniques, a plurality or a population of different cells may be studied for epigenetic differences, but there is no ability to determine those epigenetic differences on the level of an individual cell; instead, only average epigenetic profiles of those cells can be determined. By comparison, in certain embodiments of the present invention, a plurality of cells, some or all of which may contain individual epigenetic differences, may be studied, at resolutions down to the single-cell level, for example, within microfluidic droplets or other compartments or solutions such as those discussed herein.

The cells may arise from a human, or from a non-human animal, for example, an invertebrate cell (e.g., a cell from a fruit fly), a fish cell (e.g., a zebrafish cell), an amphibian cell (e.g., a frog cell), a reptile cell, a bird cell, or a mammal cell, such as a monkey, ape, cow, sheep, goat, horse, donkey, camel, llama, alpaca, rabbit, pig, mouse, rat, guinea pig, hamster, dog, cat, etc. If the cell is from a multicellular organism, the cell may be from any part of the organism. In some embodiments, a tissue may be studied. For example, a tissue from an organism may be processed to produce cells (e.g., through tissue homogenization or by laser-capturing the cells from the tissue), such that the epigenetic differences within the tissue may be determined, as discussed herein.

The cells or tissues may arise from a healthy organism, or one that is diseased or suspected of being diseased. For example, blood cells from an organism may be removed and studied to determine epigenetic differences or changes in the epigenetic profile of those cells, e.g., to determine if the animal is healthy or has a disease, for example, if the animal has cancer (e.g., by determining cancer cells within the blood). In some cases, a tumor may be studied (e.g., using a biopsy), and the epigenetic profile of the tumor may be determined. For instance, the cells may be studied to determine if any of the cells are cancer stem cells.

The cells may also be determined using other techniques, in addition to the ones discussed herein, which may assist in determining the epigenetic profile of the cells. For example, the cells may be studied using flow cytometry or microscopy, or the cells may be cultured, etc., to determine whether the epigenetic profile (or changes in the epigenetic profile) correlates with other changes in the cell, for example, expression levels of a protein, changes in morphology, ability to reproduce or differentiate, etc.

In some aspects of the invention, a plurality of cells is contained within a plurality of droplets or other compartments. In some cases, the encapsulation rate may be kept low, for example, such that the average density is about 1 cell/droplet or compartment, or less. (In other cases, higher densities are also possible, of course, e.g., greater than 1 cell/droplet or compartment.) For example, the average density may be less than about 0.95 cells/droplet or compartment, less than about 0.9 cells/droplet or compartment, less than about 0.8 cells/droplet or compartment, less than about 0.7 cells/droplet or compartment, less than about 0.6 cells/droplet or compartment, less than about 0.5 cells/droplet or compartment, less than about 0.4 cells/droplet or compartment, less than about 0.3 cells/droplet or compartment, or less than about 0.2 cells/droplet or compartment. In some cases, the cells are contained such that no more than about 25%, no more than about 15%, no more than about 10%, no more than about 5%, no more than about 3%, or no more than about 1% of the droplets or compartments contain more than one cell therein. Such relatively low densities may be useful, e.g., to avoid confusion of having more than one cell labeled with the same identification sequence (e.g., such that the DNA of the cell is ligated to an adapter), as discussed below.
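The relationship between average cell density and the fraction of droplets containing more than one cell can be estimated with a simple model. The following sketch assumes that cells are distributed among droplets according to Poisson statistics; this is a common modeling assumption for random encapsulation and is not a requirement of the embodiments described herein. The function name is hypothetical.

```python
import math


def fraction_with_multiple_cells(mean_cells_per_droplet: float) -> float:
    """Fraction of all droplets holding two or more cells, assuming Poisson loading.

    P(k >= 2) = 1 - P(0) - P(1) = 1 - exp(-m) * (1 + m), where m is the mean.
    """
    m = mean_cells_per_droplet
    if m < 0:
        raise ValueError("mean density cannot be negative")
    return 1.0 - math.exp(-m) * (1.0 + m)


if __name__ == "__main__":
    for density in (1.0, 0.5, 0.2):
        share = fraction_with_multiple_cells(density)
        print(f"{density} cells/droplet: {share:.1%} of droplets hold more than one cell")
```

Under this assumed model, lowering the average density reduces the share of multiply occupied droplets, at the cost of more empty droplets.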

The droplets may be contained in a microfluidic channel. For example, in certain embodiments, the droplets may have an average dimension or diameter of less than about 1 mm, less than about 500 micrometers, less than about 300 micrometers, less than about 200 micrometers, less than about 100 micrometers, less than about 75 micrometers, less than about 50 micrometers, less than about 30 micrometers, less than about 25 micrometers, less than about 10 micrometers, less than about 5 micrometers, less than about 3 micrometers, or less than about 1 micrometer in some cases. The average diameter may also be at least about 1 micrometer, at least about 2 micrometers, at least about 3 micrometers, at least about 5 micrometers, at least about 10 micrometers, at least about 15 micrometers, or at least about 20 micrometers in certain instances. The droplets may be spherical or non-spherical. The average diameter or dimension of a droplet, if the droplet is non-spherical, may be taken as the diameter of a perfect sphere having the same volume as the non-spherical droplet.
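The equal-volume convention for non-spherical droplets follows from the volume of a sphere, V = (pi/6) d^3, so that d = (6V/pi)^(1/3). A minimal sketch is given below; the function name is hypothetical, and the volume and returned diameter are assumed to be in consistent units (e.g., cubic micrometers and micrometers).

```python
import math


def equivalent_sphere_diameter(volume: float) -> float:
    """Diameter of a perfect sphere having the given volume: d = (6 V / pi) ** (1/3)."""
    if volume < 0:
        raise ValueError("volume cannot be negative")
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)


if __name__ == "__main__":
    # A flattened droplet with the same volume as a 50-micrometer sphere
    # is assigned a diameter of 50 micrometers under this convention.
    volume_um3 = math.pi / 6.0 * 50.0 ** 3
    print(equivalent_sphere_diameter(volume_um3))
```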

The droplets may be produced using any suitable technique. For example, a junction of channels may be used to create the droplets. The junction may be, for instance, a T-junction, a Y-junction, a channel-within-a-channel junction (e.g., in a coaxial arrangement, or comprising an inner channel and an outer channel surrounding at least a portion of the inner channel), a cross (or “X”) junction, a flow-focus junction, or any other suitable junction for creating droplets. See, for example, International Patent Application No. PCT/US2004/010903, filed Apr. 9, 2004, entitled “Formation and Control of Fluidic Species,” by Link, et al., published as WO 2004/091763 on Oct. 28, 2004, or International Patent Application No. PCT/US2003/020542, filed Jun. 30, 2003, entitled “Method and Apparatus for Fluid Dispersion,” by Stone, et al., published as WO 2004/002627 on Jan. 8, 2004, each of which is incorporated herein by reference in its entirety. In some embodiments, the junction may be configured and arranged to produce substantially monodisperse droplets.

In some cases, the cells may be encapsulated within the droplets at a relatively high rate. For example, the rate of cell encapsulation in droplets may be at least about 10 cells/s, at least about 30 cells/s, at least about 100 cells/s, at least about 300 cells/s, at least about 1,000 cells/s, at least about 3,000 cells/s, at least about 10,000 cells/s, at least about 30,000 cells/s, at least about 100,000 cells/s, at least about 300,000 cells/s, or at least about 10⁶ cells/s.

The droplets may be substantially monodisperse in some embodiments, or the droplets may have a homogeneous distribution of diameters, e.g., the droplets may have a distribution of diameters such that no more than about 10%, no more than about 5%, no more than about 3%, no more than about 2%, or no more than about 1% of the droplets have a diameter less than about 90% (or less than about 95%, less than about 97%, or less than about 99%) and/or greater than about 110% (or greater than about 101%, greater than about 103%, or greater than about 105%) of the overall average diameter of the plurality of droplets. In some embodiments, the plurality of droplets has an overall average diameter and a distribution of diameters such that the coefficient of variation of the cross-sectional diameters of the droplets is less than about 10%, less than about 5%, less than about 2%, between about 1% and about 10%, between about 1% and about 5%, or between about 1% and about 2%. The coefficient of variation may be defined as the standard deviation divided by the mean, and can be determined by those of ordinary skill in the art.
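As a non-limiting sketch, both monodispersity measures described above can be computed from a list of measured droplet diameters. The function names and sample values are hypothetical. The population standard deviation is used here as an assumption, since the text does not specify whether a population or sample standard deviation is intended.

```python
import statistics


def coefficient_of_variation(diameters):
    """Standard deviation divided by the mean (population standard deviation assumed)."""
    return statistics.pstdev(diameters) / statistics.mean(diameters)


def fraction_outside_band(diameters, lower=0.90, upper=1.10):
    """Fraction of droplets with diameter below lower*mean or above upper*mean."""
    mean = statistics.mean(diameters)
    outside = [d for d in diameters if d < lower * mean or d > upper * mean]
    return len(outside) / len(diameters)


if __name__ == "__main__":
    measured_um = [48.0, 50.0, 52.0, 49.5, 50.5]  # made-up diameters in micrometers
    print(f"CV = {coefficient_of_variation(measured_um):.2%}")
    print(f"outside 90%-110% of mean: {fraction_outside_band(measured_um):.0%}")
```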

In some embodiments, the fluid forming the droplets is substantially immiscible with the carrying fluid surrounding the droplets. For example, the fluid may be hydrophilic or aqueous, while the carrying fluid may be hydrophobic or an “oil,” or vice versa. Typically, a “hydrophilic” fluid is one that is miscible with pure water, while a “hydrophobic” fluid is a fluid that is not miscible with pure water. It should be noted that the term “oil,” as used herein, merely refers to a fluid that is hydrophobic and not miscible in water. Thus, the oil may be a hydrocarbon in some embodiments, but in other embodiments, the oil may be (or include) other hydrophobic fluids (for example, octanol). It should also be noted that the hydrophilic or aqueous fluid need not be pure water. For example, the hydrophilic fluid may be an aqueous solution, for example, a buffer solution, a solution containing a dissolved salt, or the like. A hydrophilic fluid may also be, or include, for example, ethanol or other liquids that are miscible in water, e.g., instead of or in addition to water.

In one aspect, after the cells are contained or encapsulated within droplets or other compartments, the cells may be lysed or otherwise processed to release the DNA within the cells, e.g., as a plurality of nucleosome sequences. Typically, the nucleosome sequences are those regions of the DNA that interact with the histones. The nucleosome sequence of the DNA typically winds around one or more histones to produce the basic nucleosome structure, which is subsequently packaged within the chromatin of the cell. For example, the cells may be lysed within the droplets by sonication (exposure to ultrasound), temperature or osmotic changes, exposure to certain types of enzymes or chemicals (for example, detergents such as Triton, e.g., Triton X-100), or the like. Those of ordinary skill in the art will be aware of suitable techniques for lysing cells to produce a cell lysate. Typically, the cells are lysed within the droplets without breaking down the droplets themselves, e.g., such that the cell lysate that is subsequently produced remains within the droplets.

The DNA may also be exposed to enzymes which are able to process the DNA without substantially affecting the epigenetic information of interest, e.g., without substantially altering the methylation of nucleotides within the DNA, without altering any histone modifications that might be present, etc. For example, in one set of embodiments, the DNA may be exposed to a non-nucleosome-cleaving nuclease able to cleave the DNA at regions other than where the DNA is contained within a nucleosome. Such a nuclease may accordingly be able to cleave the DNA into smaller fragments that can be subsequently analyzed (e.g., as discussed herein), without substantially affecting those portions of the DNA that interact with the histones within the nucleosome structures (i.e., the nucleosome sequences within the DNA). Accordingly, epigenetic information contained within the nucleosome structures may be preserved for subsequent determination. One example of a suitable enzyme is MNase (S7 nuclease or micrococcal nuclease), which is available commercially. In some cases, a restriction enzyme that targets a specific sequence may be used to digest genomic regions having particular sequence contents, such as GC-rich euchromatic loci.

In certain aspects, one or more adapters may be ligated or otherwise bonded onto the DNA (or RNA, in some embodiments). The adapter may be formed from DNA and/or RNA. The adapter may be single-stranded or double-stranded, and in some cases, the adapters may be palindromic or substantially palindromic. In one set of embodiments, the adapter may include an identification sequence and a restriction site (and/or a portion of a restriction site), and optionally a primer sequence. If the adapter is at least substantially palindromic, the adapter may also contain inversions of these, e.g., the adapter may contain an identification sequence, a restriction site, an inversion of the restriction site, and an inversion of the identification sequence.

One example of an adapter is shown in FIG. 7. In this example, adapter 70 includes a portion of a restriction site 71 (“site”), a first “barcode” or identification sequence 72, a second complete restriction site 73, a second “barcode” or identification sequence 74, and a second portion of a restriction site 75 (“restriction”). Adapter 70 also includes a region 76 that can be recognized by a primer. The adapter is joined to a stretch of nucleic acid, such as DNA 77, to be studied (e.g., containing a nucleosome 80 as is shown in FIG. 7, although this is just for explanatory purposes). In some cases, the first and second portions of the restriction site have the same sequence, e.g., for restriction sites that are palindromic in nature.

A restriction site is a site that is recognized by a restriction endonuclease. When the adapter is exposed to a restriction endonuclease that recognizes the restriction site, the restriction endonuclease may cleave the adapter within the restriction site. Those of ordinary skill in the art will be familiar with restriction endonucleases and restriction sites. The restriction endonuclease may cleave the restriction site to leave behind blunt ends or “sticky” ends (e.g., leaving an overhang with one or more nucleotides lacking a complement). The restriction site, in some cases, includes a recognition sequence (a specific sequence of nucleotides, e.g., 4, 5, 6, 7, or 8 nucleotides long) and a cleavage sequence that may be part of, or be separate from, the recognition sequence. For instance, with the BciVI restriction site, the restriction site includes a recognition sequence (which is 6 nucleotides in length as is shown in FIG. 3), where the restriction endonuclease recognizes the adapter, and a separate cleavage sequence where the restriction endonuclease actually cleaves the adapter (indicated by the jagged lines in FIG. 3). Those of ordinary skill in the art will be able to identify suitable restriction endonucleases and their restriction sites. Over 3000 restriction enzymes have been studied in detail, and more than 600 of these are available commercially. Non-limiting examples include BamHI, BsrI, NotI, XmaI, PspAI, DpnI, MboI, MnlI, Eco57I, Ksp632I, DraIII, AhaII, SmaI, MluI, HpaI, ApaI, BclI, BstEII, TaqI, EcoRI, SacI, HindII, HaeII, DraII, Tsp509I, Sau3AI, PacI, etc.

In one set of embodiments, an adapter may contain a portion of a restriction site, e.g., first portions and second portions such that, when the first portion and the second portion are ligated or otherwise joined together, a complete restriction site is formed. This may be useful, for example, in cases where an adapter is ligated to another adapter; the joined adapters, having a completed restriction site, may be exposed to a suitable corresponding restriction endonuclease that is able to cleave at the restriction site, thereby removing the extraneous adapters. A non-limiting example is shown schematically in FIG. 7. In FIG. 7A, a first portion of the restriction site is labeled “site” and a second portion is labeled “restriction.” A corresponding restriction endonuclease will not recognize either portion, unless the portions are joined together to form a complete restriction site that can be recognized by the restriction endonuclease (i.e., forming the phrase “restriction site” in FIG. 7B).
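The partial-site concept can be illustrated with a toy string model. In the sketch below, the half-site TTAA, the toy adapter, and the toy DNA fragment are hypothetical values chosen only for illustration; a complete site is modeled simply as two half-sites placed end to end, and actual enzyme cut positions are not modeled.

```python
HALF_SITE = "TTAA"                 # hypothetical partial restriction site at each adapter end
COMPLETE_SITE = HALF_SITE * 2      # formed only when two adapter ends are joined

TOY_ADAPTER = HALF_SITE + "GCGCGC" + HALF_SITE   # made-up adapter body
TOY_FRAGMENT = "GATCCGGA"                        # made-up genomic fragment


def contains_complete_site(sequence: str, site: str = COMPLETE_SITE) -> bool:
    """True if the complete recognition sequence appears anywhere in the sequence."""
    return site in sequence


if __name__ == "__main__":
    desired = TOY_ADAPTER + TOY_FRAGMENT + TOY_ADAPTER          # adapter-DNA-adapter
    unwanted = TOY_ADAPTER + TOY_ADAPTER + TOY_FRAGMENT         # adapter ligated to adapter
    print("desired product has complete site:", contains_complete_site(desired))
    print("adapter-adapter product has complete site:", contains_complete_site(unwanted))
```

In this model, only the adapter-to-adapter junction produces the complete site, so only that junction would be a target for the corresponding restriction endonuclease.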

As mentioned, in one set of embodiments, the adapter is substantially or completely palindromic in nature. For example, in some embodiments, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% of the adapter may be palindromic. With a palindromic adapter, the adapter cannot be added in an incorrect orientation to the nucleic acid. Similarly, if an adapter is added to another adapter, the two adapters will form a complete restriction site that can be cleaved using a suitable restriction endonuclease, especially if the restriction site portions are also palindromic. However, in some cases, the adapter is not fully palindromic, and there may be “bubble” regions that are not palindromic. For example, identification sequences 72 and/or 74 within the adapter may be chosen to not be palindromic.

One non-limiting example of an adapter is the following sequence:

(SEQ ID NO: 1) TTAA GGGCTTTC GTATCC GGGGG ACCTTAATTAAGGT GGGGG GGATACCTTTCGGG TTAA

It should be noted that this sequence is not fully palindromic, as certain regions (such as the underlined portions) are not palindromic. In this example, the two outer underlined regions may be used as identification sequences within the adapter. It should be noted that these regions are mirror images of each other, e.g., for ease of identification, rather than palindromes of each other (although this is not necessarily a requirement of the adapter). Other sequences (such as the repeating GGGGG portions in this particular example) may also be selected to be nonpalindromic, e.g., so that the adapter does not readily form stem-loop structures. In addition, the primer TAAGGTGGGGGGGATAC (SEQ ID NO: 2) may be used with this adapter.
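As a rough illustration, the degree to which an adapter is palindromic can be estimated computationally. The sketch below treats a position as palindromic when the base at that position equals the base at the same position in the reverse complement; this position-wise measure, and the function names, are assumptions made for illustration and are not a definition taken from this disclosure.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def palindromic_fraction(seq):
    """Fraction of positions at which `seq` matches its reverse complement."""
    rc = reverse_complement(seq)
    matches = sum(a == b for a, b in zip(seq, rc))
    return matches / len(seq)


# SEQ ID NO: 1 with the spacing removed.
SEQ_ID_NO_1 = "".join(
    "TTAA GGGCTTTC GTATCC GGGGG ACCTTAATTAAGGT GGGGG GGATACCTTTCGGG TTAA".split()
)

if __name__ == "__main__":
    print(len(SEQ_ID_NO_1), palindromic_fraction(SEQ_ID_NO_1))
```

Under this measure, 34 of the 60 nucleotides of SEQ ID NO: 1 (about 57%) are palindromic; the positions that do not match fall within the identification sequences and the GGGGG portions.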

The identification sequence may comprise any suitable number of nucleotides (for example, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides). In some cases, a plurality of adapters is prepared, containing identical (or substantially similar) restriction sites but different identification sequences. The different adapters may all thus be of the same length, in some embodiments. Depending on the number of nucleotides chosen to be the identification sequence, a number of unique adapter sequences can be created. For example, if the identification sequence is 4 nucleotides long, then up to 4⁴ unique adapters may be created; if the identification sequence is 5 nucleotides long, then up to 4⁵ unique adapters may be created; etc. (e.g., up to 4ⁿ unique sequences where n is the number of nucleotides in the identification sequence).
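The 4ⁿ relationship can be illustrated with a short Python sketch. The function names are hypothetical; the second function simply inverts the relationship to find the shortest identification sequence length that gives at least a desired number of distinct adapters.

```python
from itertools import product


def all_identification_sequences(n):
    """Enumerate every possible identification sequence of length n."""
    return ["".join(bases) for bases in product("ACGT", repeat=n)]


def min_length_for(num_compartments):
    """Smallest n such that 4**n is at least num_compartments."""
    n = 1
    while 4 ** n < num_compartments:
        n += 1
    return n


if __name__ == "__main__":
    print(len(all_identification_sequences(4)))  # 4**4 sequences
    print(min_length_for(1000))                  # shortest length covering 1000 droplets
```

In practice, fewer than 4ⁿ sequences might be used, e.g., if palindromic or otherwise undesirable identification sequences are excluded.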

In some cases, the adapter may also contain a suitable primer sequence, e.g., such that the adapter may be recognized by an enzyme used in PCR (“Polymerase Chain Reaction”). Typically, the length of the primer sequence is not more than about 30 nucleotides. Such a primer may be used, for example, to amplify the DNA (or RNA, in some cases) that the adapter is bound to.

A variety of techniques may be used to prepare the library of adapters. For example, microwell plates having wells that contain the members of the library could be fabricated, e.g., using automated techniques, and the members then encapsulated into droplets or other compartments. As another example, a randomized oligomer population could be encapsulated at no more than one oligomer per drop and amplified within each droplet or compartment. In some cases, a library of adapters may be present on a solid support. Examples include particles such as magnetic particles, hydrogel beads, agarose beads, or sepharose beads.

Any suitable ligase may be used to ligate the adapter to DNA or RNA. Many such ligases are commercially available, e.g., from Epicentre®. In addition, enzymes such as End-It™ may be used, e.g., to repair DNA ends that were subjected to MNase degradation.

In some cases, the cell lysate is exposed to a solution containing adapters (e.g., with ligases) to cause the adapters to bind to the DNA. The solution may be contained within a droplet or other compartment. The action of the ligases may cause the adapters to be added randomly to the DNA. For example, an adapter may be ligated to one or both ends of a DNA strand, more than one adapter may be ligated to one end of a strand (e.g., ligated to each other), one or more adapters can be ligated to one or both ends of the DNA and to each other to produce a circular DNA fragment, and in some cases, adapters may be ligated directly to each other without any DNA in between. However, in a subsequent step, the DNA may be exposed to a restriction endonuclease that is able to cleave the adapters at a cleavage site. Doing so effectively causes only half of the adapter (containing the identification sequence) to stay behind on the DNA, while the rest of the adapter (and any other adapters that may have been connected to the first adapter) will be cleaved away. Accordingly, at the end of this step, at least some of the DNA will be labeled with a unique identification sequence at one or both ends, and thus, the origin of the DNA can be determined using the identification sequence, even if later mixed with DNA similarly processed but containing different identification sequences. Shorter fragments (e.g., the cleaved ends of the adapters, remains of adapters bound to each other that have also been cleaved into small fragments, etc.) can be subsequently removed through regular DNA purification methods (size cutoff), or left in solution but ignored. Thus, by tracking the different identification sequences, the genetic or epigenetic information of each cell can be separately determined, e.g., potentially at a single cell level.

In one set of embodiments, the cell lysate (contained within droplets or other compartments) is fused or coalesced with other droplets containing adapters. In some cases, substantially each of the droplets or compartments contains unique adapters (i.e., the adapters may be substantially similar, but contain different identification sequences). Accordingly, the cell lysate of each droplet or compartment can be uniquely identified by determining the identification sequences.

Any suitable technique may be used to fuse a first droplet and a second droplet together to create a combined droplet. For example, the first and second droplets may each be given opposite electric charges (i.e., positive and negative charges, not necessarily of the same magnitude), which may increase the electrical interaction of the two droplets such that fusion or coalescence of the droplets can occur due to their opposite electric charges, e.g., to produce the combined droplet. For instance, an electric field may be applied to the droplets, the droplets may be passed through a capacitor, a chemical reaction may cause the droplets to become charged, etc.

In another set of embodiments, the separate droplets may not necessarily be given opposite electric charges (and, in some cases, may not be given any electric charge), and the droplets may instead be fused through the use of dipoles induced in the fluidic droplets that cause the fluidic droplets to coalesce. The dipoles may be induced using an electric field which may be an AC field, a DC field, etc., and the electric field may be created, for instance, using one or more electrodes. The induced dipoles in the fluidic droplets may cause the fluidic droplets to become electrically attracted towards each other due to their local opposite charges, thus causing the droplets to fuse.

Still other examples of fusing or coalescing separate droplets to produce combined droplets are described in International Patent Application No. PCT/US2004/010903, filed Apr. 9, 2004, entitled “Formation and Control of Fluidic Species,” by Link, et al., published as WO 2004/091763 on Oct. 28, 2004, and International Patent Application No. PCT/US2004/027912, filed Aug. 27, 2004, entitled “Electronic Control of Fluidic Species,” by Link, et al., published as WO 2005/021151 on Mar. 10, 2005, each incorporated herein by reference in its entirety.

After ligation of the adapter, the various droplets or compartments containing DNA may be combined together in some aspects of the invention, e.g., to produce a common solution containing the DNA. Although the DNA may have arisen from different cells or compartments, due to the presence of the ligated adapters (e.g., containing unique identification sequences), the DNA is now distinguishable. The droplets may be combined by removing surfactant, removing the continuous fluid containing the droplets, or any other suitable technique.

The DNA may be processed or sequenced using any suitable technique, in accordance with certain aspects of the invention. For example, techniques such as Chromatin Immunoprecipitation (“ChIP”), ChIP-Sequencing, ChIP-on-chip, fluorescent in situ hybridization, methylation-sensitive restriction enzymes, DNA adenine methyltransferase identification (DamID), or bisulfite sequencing may be used to analyze the labeled DNA. Optionally, the DNA may be amplified, e.g., using PCR techniques known to those of ordinary skill in the art.

In one set of embodiments, some of the DNA may be analyzed or sequenced to determine a certain feature or characteristic. For example, in one set of embodiments, some of the DNA, still attached to nucleosomal structure, may be immunoprecipitated by exposing the fragments to an antibody, for example, a histone-recognizing antibody such as an antibody against H3-lysine-4-methyl, or by exposing the naked DNA to a methylcytosine antibody or a hydroxymethylcytosine antibody. Thus, for example, DNA having a certain feature (e.g., methylated histones or deacylated histones) may be removed and analyzed or sequenced. Other examples of histone modifications that may be studied include acetylation, methylation, ubiquitylation, phosphorylation and sumoylation.

In some cases, the DNA that is sequenced may be aligned with a genome (e.g., a known genome, such as a human genome) to determine locations of the DNA within the genome (e.g., of a particular cell) that exhibit such features (e.g., methylated histones or deacylated histones). Thus, for example, certain nucleosomes within the genome may be identified as exhibiting such features. An example of such a study is discussed below in Example 1.

In other embodiments, RNA molecules, e.g., from individual cells, couldbe “barcoded” or otherwise ligated with an identification sequence indroplets (or other compartments) using single stranded indexed adaptors.The adaptors may be coupled to the RNA molecules, for example, by directligation, by poly-T or random primer-based reverse transcriptionmethods, or by other methods known to those of ordinary skill in theart. In some embodiments, selected RNA sequences could be interrogatedby introducing a collection of single-stranded adaptors each comprisinga barcode or other identification sequence (e.g., indexed to a singlecell) and known sequences complementary to the RNA species of interest,followed by reverse transcription in single-cell-containing droplets.Template switching may be used in some embodiments. Accordingly, itshould be understood that in the embodiments and examples discussedherein using DNA, this is by way of example only, and that in otherembodiments, RNA could be used instead of and/or in addition to DNA.

A variety of materials and methods, according to certain aspects of theinvention, can be used to produce fluidic systems and microfluidicsystems such as those described herein. In some cases, the variousmaterials selected lend themselves to various methods. For example,various components of the invention can be formed from solid materials,in which the channels can be formed via micromachining, film depositionprocesses such as spin coating and chemical vapor deposition, laserfabrication, photolithographic techniques, etching methods including wetchemical or plasma processes, and the like. See, for example, ScientificAmerican, 248:44-55, 1983 (Angell, et al). In one embodiment, at least aportion of the fluidic system is formed of silicon by etching featuresin a silicon chip. Technologies for precise and efficient fabrication ofvarious fluidic systems and devices of the invention from silicon areknown. In another embodiment, various components of the systems anddevices of the invention can be formed of a polymer, for example, anelastomeric polymer such as polydimethylsiloxane (“PDMS”),polytetrafluoroethylene (“PTFE” or Teflon®), or the like.

Different components can be fabricated of the same or differentmaterials. For example, a base portion including a bottom wall and sidewalls can be fabricated from an opaque material such as silicon or PDMS,and a top portion can be fabricated from a transparent or at leastpartially transparent material, such as glass or a transparent polymer,for observation and/or control of the fluidic process. Components can becoated so as to expose a desired chemical functionality to fluids thatcontact interior channel walls, where the base supporting material doesnot have a precise, desired functionality. For example, components canbe fabricated as illustrated, with interior channel walls coated withanother material. Material used to fabricate various components of thesystems and devices of the invention, e.g., materials used to coatinterior walls of fluid channels, may desirably be selected from amongthose materials that will not adversely affect or be affected by fluidflowing through the fluidic system, e.g., material(s) that is chemicallyinert in the presence of fluids to be used within the device.

In one embodiment, various components of the invention are fabricatedfrom polymeric and/or flexible and/or elastomeric materials, and can beconveniently formed of a hardenable fluid, facilitating fabrication viamolding (e.g. replica molding, injection molding, cast molding, etc.).The hardenable fluid can be essentially any fluid that can be induced tosolidify, or that spontaneously solidifies, into a solid capable ofcontaining and/or transporting fluids contemplated for use in and withthe fluidic network. In one embodiment, the hardenable fluid comprises apolymeric liquid or a liquid polymeric precursor (i.e. a “prepolymer”).Suitable polymeric liquids can include, for example, thermoplasticpolymers, thermoset polymers, or mixture of such polymers heated abovetheir melting point. As another example, a suitable polymeric liquid mayinclude a solution of one or more polymers in a suitable solvent, whichsolution forms a solid polymeric material upon removal of the solvent,for example, by evaporation. Such polymeric materials, which can besolidified from, for example, a melt state or by solvent evaporation,are well known to those of ordinary skill in the art. A variety ofpolymeric materials, many of which are elastomeric, are suitable, andare also suitable for forming molds or mold masters, for embodimentswhere one or both of the mold masters is composed of an elastomericmaterial. A non-limiting list of examples of such polymers includespolymers of the general classes of silicone polymers, epoxy polymers,and acrylate polymers. Epoxy polymers are characterized by the presenceof a three-membered cyclic ether group commonly referred to as an epoxygroup, 1,2-epoxide, or oxirane. For example, diglycidyl ethers ofbisphenol A can be used, in addition to compounds based on aromaticamine, triazine, and cycloaliphatic backbones. Another example includesthe well-known Novolac polymers. 
Non-limiting examples of silicone elastomers suitable for use according to the invention include those formed from precursors including the chlorosilanes such as methylchlorosilanes, ethylchlorosilanes, phenylchlorosilanes, etc.

Silicone polymers are preferred in one set of embodiments, for example, the silicone elastomer polydimethylsiloxane. Non-limiting examples of PDMS polymers include those sold under the trademark Sylgard by Dow Chemical Co., Midland, Mich., and particularly Sylgard 182, Sylgard 184, and Sylgard 186. Silicone polymers including PDMS have several beneficial properties simplifying fabrication of the microfluidic structures of the invention. For instance, such materials are inexpensive, readily available, and can be solidified from a prepolymeric liquid via curing with heat. For example, PDMSs are typically curable by exposure of the prepolymeric liquid to temperatures of, for example, about 65° C. to about 75° C. for exposure times of, for example, about an hour. Also, silicone polymers, such as PDMS, can be elastomeric, and thus may be useful for forming very small features with relatively high aspect ratios, necessary in certain embodiments of the invention. Flexible (e.g., elastomeric) molds or masters can be advantageous in this regard.

One advantage of forming structures such as microfluidic structures ofthe invention from silicone polymers, such as PDMS, is the ability ofsuch polymers to be oxidized, for example by exposure to anoxygen-containing plasma such as an air plasma, so that the oxidizedstructures contain, at their surface, chemical groups capable ofcross-linking to other oxidized silicone polymer surfaces or to theoxidized surfaces of a variety of other polymeric and non-polymericmaterials. Thus, components can be fabricated and then oxidized andessentially irreversibly sealed to other silicone polymer surfaces, orto the surfaces of other substrates reactive with the oxidized siliconepolymer surfaces, without the need for separate adhesives or othersealing means. In most cases, sealing can be completed simply bycontacting an oxidized silicone surface to another surface without theneed to apply auxiliary pressure to form the seal. That is, thepre-oxidized silicone surface acts as a contact adhesive againstsuitable mating surfaces. Specifically, in addition to beingirreversibly sealable to itself, oxidized silicone such as oxidized PDMScan also be sealed irreversibly to a range of oxidized materials otherthan itself including, for example, glass, silicon, silicon oxide,quartz, silicon nitride, polyethylene, polystyrene, glassy carbon, andepoxy polymers, which have been oxidized in a similar fashion to thePDMS surface (for example, via exposure to an oxygen-containing plasma).Oxidation and sealing methods useful in the context of the presentinvention, as well as overall molding techniques, are described in theart, for example, in an article entitled “Rapid Prototyping ofMicrofluidic Systems and Polydimethylsiloxane,” Anal. Chem., 70:474-480,1998 (Duffy, et al.), incorporated herein by reference.

In some embodiments, certain microfluidic structures of the invention (or interior, fluid-contacting surfaces) may be formed from certain oxidized silicone polymers. Such surfaces may be more hydrophilic than the surface of an elastomeric polymer. Such hydrophilic channel surfaces can thus be more easily filled and wetted with aqueous solutions.

In one embodiment, a bottom wall of a microfluidic device of theinvention is formed of a material different from one or more side wallsor a top wall, or other components. For example, the interior surface ofa bottom wall can comprise the surface of a silicon wafer or microchip,or other substrate. Other components can, as described above, be sealedto such alternative substrates. Where it is desired to seal a componentcomprising a silicone polymer (e.g. PDMS) to a substrate (bottom wall)of different material, the substrate may be selected from the group ofmaterials to which oxidized silicone polymer is able to irreversiblyseal (e.g., glass, silicon, silicon oxide, quartz, silicon nitride,polyethylene, polystyrene, epoxy polymers, and glassy carbon surfaceswhich have been oxidized). Alternatively, other sealing techniques canbe used, as would be apparent to those of ordinary skill in the art,including, but not limited to, the use of separate adhesives, thermalbonding, solvent bonding, ultrasonic welding, etc.

As mentioned, in some, but not all embodiments, the systems and methodsdescribed herein may include one or more microfluidic components, forexample, one or more microfluidic channels. The “cross-sectionaldimension” of a microfluidic channel is measured perpendicular to thedirection of fluid flow within the channel. Thus, some or all of themicrofluidic channels may have a largest cross-sectional dimension lessthan 2 mm, and in certain cases, less than 1 mm. In one set ofembodiments, the maximum cross-sectional dimension of a microfluidicchannel is less than about 500 micrometers, less than about 300micrometers, less than about 200 micrometers, less than about 100micrometers, less than about 50 micrometers, less than about 30micrometers, less than about 10 micrometers, less than about 5micrometers, less than about 3 micrometers, or less than about 1micrometer. In certain embodiments, the microfluidic channels may beformed in part by a single component (e.g. an etched substrate or moldedunit). Of course, larger channels, tubes, chambers, reservoirs, etc. canalso be used to store fluids and/or deliver fluids to various componentsor systems in other embodiments of the invention.

A microfluidic channel can have any cross-sectional shape (circular,oval, triangular, irregular, square or rectangular, or the like) and canbe covered or uncovered. In embodiments where it is completely covered,at least one portion of the channel can have a cross-section that iscompletely enclosed, or the entire channel may be completely enclosedalong its entire length with the exception of its inlet(s) and/oroutlet(s). A channel may also have an aspect ratio (length to averagecross sectional dimension) of at least 2:1, more typically at least 3:1,5:1, 10:1, 15:1, 20:1, or more.

In some embodiments, at least a portion of one or more of the channelsmay be hydrophobic, or treated to render at least a portion hydrophobic.For example, one non-limiting method for making a channel surfacehydrophobic comprises contacting the channel surface with an agent thatconfers hydrophobicity to the channel surface. For example, in someembodiments, a channel surface may be contacted (e.g., flushed) withAquapel® (a commercial auto glass treatment) (PPG Industries,Pittsburgh, Pa.). In some cases, a channel surface contacted with anagent that confers hydrophobicity may be subsequently purged with air.In some embodiments, the channel may be heated (e.g., baked) toevaporate solvent that contains the agent that confers hydrophobicity.

Thus, in some aspects of the invention, a surface of a microfluidicchannel may be modified, e.g., by coating a sol-gel onto at least aportion of a microfluidic channel. As an example, the sol-gel coatingmay be made more hydrophobic by incorporating a hydrophobic polymer inthe sol-gel. For instance, the sol-gel may contain one or more silanes,for example, a fluorosilane (i.e., a silane containing at least onefluorine atom) such as heptadecafluorosilane, or other silanes such asmethyltriethoxy silane (MTES) or a silane containing one or more lipidchains, such as octadecylsilane or other CH₃(CH₂)_(n)-silanes, where ncan be any suitable integer. For instance, n may be greater than 1, 5,or 10, and less than about 20, 25, or 30. The silanes may alsooptionally include other groups, such as alkoxide groups, for instance,octadecyltrimethoxysilane. In general, most silanes can be used in thesol-gel, with the particular silane being chosen on the basis of desiredproperties such as hydrophobicity. Other silanes (e.g., having shorteror longer chain lengths) may also be chosen in other embodiments of theinvention, depending on factors such as the relative hydrophobicity orhydrophilicity desired. In some cases, the silanes may contain othergroups, for example, groups such as amines, which would make the sol-gelmore hydrophilic. Non-limiting examples include diamine silane, triaminesilane, or N-[3-(trimethoxysilyl)propyl] ethylene diamine silane. Thesilanes may be reacted to form oligomers or polymers within the sol-gel,and the degree of polymerization (e.g., the lengths of the oligomers orpolymers) may be controlled by controlling the reaction conditions, forexample by controlling the temperature, amount of acid present, or thelike. 
In some cases, more than one silane may be present in the sol-gel.For instance, the sol-gel may include fluorosilanes to cause theresulting sol-gel to exhibit greater hydrophobicity, and/or othersilanes (or other compounds) that facilitate the production of polymers.In some cases, materials able to produce SiO₂ compounds to facilitatepolymerization may be present, for example, TEOS (tetraethylorthosilicate). It should be understood that the sol-gel is not limitedto containing only silanes, and other materials may be present inaddition to, or in place of, the silanes. For instance, the coating mayinclude one or more metal oxides, such as SiO₂, vanadia (V₂O₅), titania(TiO₂), and/or alumina (Al₂O₃).

In some instances, the microfluidic channel is constructed from a material suitable to receive the sol-gel, for example, glass, metal oxides, or polymers such as polydimethylsiloxane (PDMS) and other siloxane polymers. For example, in some cases, the microfluidic channel may be one which contains silicon atoms, and in certain instances, the microfluidic channel may be chosen such that it contains silanol (Si—OH) groups, or can be modified to have silanol groups. For instance, the microfluidic channel may be exposed to an oxygen plasma, an oxidant, or a strong acid to cause the formation of silanol groups on the microfluidic channel.

If compartments are used, the compartments may be wells of a microwell plate (e.g., a 96-well, a 384-well, a 1536-well, a 3456-well microwell plate, etc.). In yet other embodiments, the compartments may be individual tubes or containers, test tubes, microfuge tubes, glass vials, bottles, petri dishes, wells of a plate, or the like. In some cases, the compartments may have relatively small volumes (e.g., less than about 1 microliter, less than about 300 nl, less than about 100 nl, less than about 30 nl, less than about 10 nl, less than about 3 nl, less than about 1 nl, etc.). In some cases, the compartments may be individually accessible.

The following documents are incorporated herein by reference in theirentireties: International Patent Application No. PCT/US2004/010903,filed Apr. 9, 2004, entitled “Formation and Control of Fluidic Species,”by Link, et al., published as WO 2004/091763 on Oct. 28, 2004;International Patent Application No. PCT/US2003/020542, filed Jun. 30,2003, entitled “Method and Apparatus for Fluid Dispersion,” by Stone, etal., published as WO 2004/002627 on Jan. 8, 2004; International PatentApplication No. PCT/US2006/007772, filed Mar. 3, 2006, entitled “Methodand Apparatus for Forming Multiple Emulsions,” by Weitz, et al.,published as WO 2006/096571 on Sep. 14, 2006; International PatentApplication No. PCT/US2004/027912, filed Aug. 27, 2004, entitled“Electronic Control of Fluidic Species,” by Link, et al., published asWO 2005/021151 on Mar. 10, 2005; International Patent Application No.PCT/US2007/002063, filed Jan. 24, 2007, entitled “Fluidic DropletCoalescence,” by Ahn, et al., published as WO 2007/089541 on Aug. 9,2007; International Patent Application No. PCT/US2008/013912, filed Dec.19, 2008, entitled “Systems and Methods for Nucleic Acid Sequencing,” byWeitz, et al., published as WO 2009/085215 on Jul. 9, 2009; andInternational Patent Application No. PCT/US2008/008563, filed Jul. 11,2008, entitled “Droplet-Based Selection,” by Weitz, et al., published asWO 2009/011808 on Jan. 22, 2009. Also incorporated by reference in itsentirety is U.S. Provisional Patent Application Ser. No. 61/634,744,filed Mar. 5, 2012, entitled “Systems and Methods for EpigeneticSequencing,” by Rotem, et al.

The following examples are intended to illustrate certain embodiments of the present invention, but do not exemplify the full scope of the invention.

Example 1

This example illustrates certain systems and methods for profiling epigenomes of single cells within populations using droplet-based microfluidics.

All cell types in the human contain essentially identical genomes (i.e.,the DNA sequence). However they vary in terms of how the DNA isorganized by chromatin, a higher order structure of protein, DNA andRNA. Chromatin structure plays an important role in regulating genomefunction and in particular its varied structure across cell types helpsensure that the correct genes are expressed in the correct cell types.Chromatin structure is regulated by histone modifications and DNAmethylation. Thus, genomewide maps of histone modifications or DNAmethylation in a given cell type are a valuable research tool. Thesemaps are collectively referred to as epigenomic profiles or“epigenomes.” In addition to their value for understanding normaldevelopment, epigenomic profiles have clinical relevance as they canidentify defects in genome regulation in cancer or other diseases,propose therapeutic strategies or serve as diagnostic or early detectionbiomarkers.

Current methods for profiling epigenomes require thousands of cells. Forexample, histone modifications can be mapped by immunoprecipitatingchromatin with antibodies to a modified histone and then sequencing theDNA (ChIP-seq). However, this method requires ˜100,000 cells or more.Furthermore, the analysis is carried out on the entire population and isblind to differences among cells. Approaches capable of profilingepigenomes in single cells are important for understanding theprinciples of chromatin and genome regulation. Moreover, such approachescould have many clinical applications in cancer biology, immunology,neuroscience or other fields in which subject tissues are complex,heterogeneous and/or limited in size. For example, tumors representheterogeneous mixtures of cells that may be driven by sub-populations ofcancer stem cells. Single cell epigenomic profiling methods couldimprove understanding of critical epigenomic changes in cancer stemcells. They might also enable early detection or surveillance ofdisease.

This example describes an epigenomic profiling method that can be used to map histone modifications in hundreds, thousands or more individual cells in a population. This example uses microfluidic devices to capture single cells in single droplets. The cells are then lysed and the genome is fragmented by enzymatic digestion in the droplets. Finally, DNA oligonucleotides with unique “barcodes” or identification sequences are added to each droplet and ligated to the fragmented genomic DNA sequences, thus providing a unique identifier for each individual cell. The material now includes DNA fragments wrapped around histones and “bar coded” according to the cell from which they originated. The material can now be combined (e.g., as “combined and indexed chromatin matter”) and subjected to epigenomic profiling.

In some embodiments, profiling can be performed by chromatin immunoprecipitation using an antibody against a modified histone (e.g., histone H3 lysine 4 trimethyl). After “pull-down” or separation of nucleosomes associated with this modified histone form (e.g., on a substrate), the DNA is isolated and bar-coded fragments are selectively introduced into a sequencing library. The DNA is then sequenced, for example, using next-generation sequencing instruments (e.g., Illumina HiSeq).

After sequencing, data may be processed in the following succession: (i) each read is assigned to an original cell based on its bar code; (ii) each read is aligned to the genome based on the sequence attached to the bar code; (iii) genomewide profiles are generated for each cell, based on the union of reads with the same bar code; specifically, the profiles reflect the density of reads as a function of genomic position. Furthermore, (iv) clustering algorithms can be applied to the individual profiles and used to identify dominant patterns characteristic of different cellular states in a heterogeneous population.
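Steps (i) through (iii) can be sketched, purely for illustration, as follows. The sketch assumes that each read is a string beginning with a fixed-length bar code, and it substitutes a toy exact-match search for a real aligner; the function names, the read layout, and the binning scheme are all hypothetical and are not taken from this example. Step (iv), clustering, is omitted.

```python
from collections import defaultdict


def demultiplex(reads, barcode_len):
    """Step (i): group reads by their leading bar code.

    Returns {bar code: [genomic insert, ...]} with the bar code stripped.
    """
    by_cell = defaultdict(list)
    for read in reads:
        by_cell[read[:barcode_len]].append(read[barcode_len:])
    return dict(by_cell)


def align(insert, genome):
    """Step (ii): toy exact-match stand-in for a real aligner.

    Returns the first 0-based genome position of `insert`, or None if absent.
    """
    pos = genome.find(insert)
    return pos if pos != -1 else None


def density_profiles(reads, genome, barcode_len, bin_size):
    """Step (iii): per-cell read counts in fixed-width genomic bins."""
    n_bins = (len(genome) + bin_size - 1) // bin_size
    profiles = {}
    for barcode, inserts in demultiplex(reads, barcode_len).items():
        counts = [0] * n_bins
        for insert in inserts:
            pos = align(insert, genome)
            if pos is not None:  # unaligned reads are skipped
                counts[pos // bin_size] += 1
        profiles[barcode] = counts
    return profiles


if __name__ == "__main__":
    genome = "ACGTACGTTTGACCAGGCTA"  # made-up 20-nucleotide "genome"
    reads = ["AAAAACGTACG", "AAAAGACCAGG", "CCCCCAGGCTA", "CCCCTTTTTTT"]
    print(density_profiles(reads, genome, barcode_len=4, bin_size=10))
```

Each resulting profile is a vector of read counts per bin, which is a form that standard clustering algorithms could take as input for step (iv).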

As a specific example, using a microfluidic device, cells may be encapsulated in drops at a density of at most one cell per drop. The cells are lysed and chromatin is fragmented by MNase enzymatic digestion into its single units, called nucleosomes (a special buffer was optimized for this step, including Triton, MNase and CaCl₂ to complement MNase requirements). Each nucleosome includes a segment of DNA wound around a histone protein core. By a process of droplet fusion, each droplet containing cells is fused with a droplet containing a cocktail of enzymes, including EndIt (Epicentre: repairs DNA ends that were subjected to MNase digestion), ligase (Epicentre), modified buffer including EGTA that stops the MNase digestion, and double-stranded, barcoded oligonucleotide adapters.

The oligonucleotide adapter comprises a barcode that is unique, specific to each droplet (or individual cell). It also contains a universal PCR primer sequence and a restriction site (i.e., the oligonucleotides vary in terms of their barcodes, but are constant in terms of the primer sequence and restriction site). The enzyme cocktail effectively ligates barcoded adapters to the ends of the DNA fragments in the droplet. Thus, after barcoding, each piece includes a fragment of genome flanked by barcoded adapters and wrapped around histones. Before breaking the droplets and merging them to form one aqueous volume, dilution buffer containing both EGTA and EDTA is added, in concentrations that will stop any enzymatic reaction and maintain detergent levels.

Since these complexes are “bar-coded” with identification sequences by their cell of origin, they may now be combined for epigenomic profiling. Specifically, the droplets are pooled together and broken down to form one aqueous volume. This combined and indexed chromatin matter is then subjected to immunoprecipitation or “ChIP” using an antibody, e.g., an antibody against a histone modification (e.g., H3 lysine 4 trimethyl or H3 lysine 4 methyl). This enriches for fragments associated with histones having this modification. The enriched DNA is then isolated using any suitable technique.

Fragments within the enriched DNA sample that have “bar-coded” adapters attached can be selected by amplification followed by restriction, using the universal primer and the restriction sites on the adapters. The restriction event leaves an end that is compatible with a next round of ligation to sequencing adapters. The result is a sequencing library that contains “bar-coded” sequences from the epigenomic enrichment assay. The “bar-codes” may serve as indexes that allow DNA fragments to be assigned to individual single cells. The fragments can also be aligned to a genome. For example, a computational pipeline may include demultiplexing of the sequenced DNA, alignment to a genome using known techniques such as the Bowtie algorithm, peak-calling using the Scripture algorithm, and/or clustering to elucidate different population profiles.

The cells may be encapsulated in drops at rates of thousands per second, or millions per hour. To prepare a library of unique barcodes, the contents of a micro-titer well plate containing oligonucleotides (e.g., an oligonucleotide library) may be sequenced. In some cases, a randomized oligomer population can be encapsulated at no more than one in a drop and amplified inside each drop to create a homogenized oligomer drop catalog.

Example 2

This example illustrates various techniques useful for epigenetic sequencing in accordance with certain embodiments of the invention.

Cell culture. K562 erythrocytic leukaemia cells (ATCC CCL-243) were grown according to standard protocols in RPMI 1640 media (Invitrogen, 22400105) supplemented with 10% fetal bovine serum (FBS, Atlas Biologicals, F-0500-A) and 10% penicillin/streptomycin (Invitrogen, 15140122).

Cell lysis and chromatin digestion in droplets. Using a microfluidic device, cells were encapsulated in droplets at a density of at most one cell per droplet.

The cells were lysed and chromatin was fragmented in 1% Triton, 0.1% sodium deoxycholate, 50 mM Tris-HCl pH 7.5, 150 mM NaCl supplemented with 10 units/ml of MNase (Thermo Scientific, 88216), 1 mM CaCl₂ and EDTA-free protease inhibitor (Roche, 13015000). The cells were incubated for 10 min at 4° C., 15 min at 37° C., and put back at 4° C. until the next step.

Adapter ligation in droplets. Each nucleosome was formed from a segment of DNA wound around a histone protein core. By a process of droplet fusion, each droplet (typically containing a single cell) was fused with a droplet containing a unique barcoded adaptor at a final concentration of 500 micromolar. Additional buffer was pico-injected into the fused droplets at the same time. This buffer had a total volume of 104 microliters and contained 8 microliters End-It™ (Epicentre, ER81050) that repairs DNA ends that were subjected to MNase digestion, 20 microliters End-It™ buffer, 8 Fast link ligase (Epicentre, LK6201H), 20 microliters fast link ligation buffer, 20 microliters dNTPs, 12 microliters of 10 mM ATP, and 8 microliters EGTA to a final concentration of 40 mM that stopped the MNase digestion.

The cells may be encapsulated in droplets at rates of thousands per second, or millions per hour. To prepare a library of unique barcodes, the contents of a micro-titer well plate containing the oligo-library could be encapsulated in droplets. Alternatively, a randomized oligomer population could be encapsulated at no more than one per droplet and amplified inside each droplet to create a homogenized oligomer droplet catalog.

Adapter design. The oligomer adapters used in this example comprised an 8-mer identification sequence (or “barcode”) that was unique to each droplet (individual cell). The adapters also contained a universal PCR primer sequence (forward: ACACGCAGTATCCCTTCG (SEQ ID NO: 3), reverse: ACTGCGTGTATCCGACTC (SEQ ID NO: 4)) and a restriction site for BciVI (NEB, R0596S) that cuts to leave a 3′ overhang. Thus, the oligomers varied in terms of their identification sequences, but were constant in terms of the primer sequence and restriction site. The enzyme cocktail effectively ligated blunt ended barcoded adaptors to the ends of the repaired DNA fragments in the droplet. Thus, after ligating the adapter, the nucleic acids typically included a fragment of genome flanked by barcoded adaptors and wrapped about histones.

Breaking droplets. Since the nucleic acids were uniquely labeled with the adapters by their cell of origin, they could subsequently be combined for epigenomic profiling. The droplets were pooled together and broken into one aqueous volume. Before breaking the droplets into one aqueous volume, dilution buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% Triton, 20 mM EGTA, and 20 mM EDTA) was added. The concentrations of EGTA, EDTA, and detergent were thus maintained (i.e., at 20 mM, 10 mM, and 1%, respectively). 1H,1H,2H,2H-perfluoro-1-octanol, 97% (Sigma, 370533-25G) was used to break the droplets.

Chromatin immunoprecipitation. Next, 5 to 10 micrograms of H3K4me3 antibody (Millipore, 17-614) were pre-bound by incubating with a mix of Protein-A and Protein-G Dynabeads (Invitrogen, 100-02D and 100-07D, respectively) in blocking buffer (PBS supplemented with 0.5% TWEEN and 0.5% BSA) for 2 hours. Washed beads were added to the chromatin lysate (aqueous phase of broken droplets) for overnight incubation. The samples were washed 6 times with RIPA buffer, twice with RIPA buffer supplemented with 500 mM NaCl, twice with LiCl buffer (10 mM TE, 250 mM LiCl, 0.5% NP-40, 0.5% DOC), twice with TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA), and then eluted with 0.5% SDS, 300 mM NaCl, 5 mM EDTA, 10 mM Tris-HCl pH 8.0 at 65° C. The eluant was incubated at 65° C. for 1 hour, then treated sequentially with RNaseA (Roche, 11119915001) for 30 min and Proteinase K (NEB, P8102S) for two hours. The DNA was purified using Agencourt AMPure XP beads (Beckman Genomics, A63881).

Sequencing library preparation. Fragments within the enriched DNA sample that had ligated adapters were selected by amplification followed by restriction using the universal primer and the restriction sites on the adapters as described herein. The restriction event left a 3′ A overhang that was compatible with a next round of Illumina adapter ligation. The result was a sequencing library that was enriched with nucleic acids from the epigenomic enrichment assay. The identification sequences or “barcodes” within the nucleic acids then served as indexes that allowed the DNA fragments to be assigned to individual single cells.

Computational pipeline. The first step included de-multiplexing of the sequenced reads, first according to the Illumina indexes and then according to the 8-mer indexes that are present on each read and hold cell origin information for each fragment. Next, the Bowtie algorithm was used to align the reads to the genome, and a peak-caller (Scripture algorithm) was used to find regions with significant signal to noise ratios. Finally, panning was performed by clustering single cell epigenome profiles to elucidate the heterogeneity in the cell population, detect different cell types in a mixed population, and/or extract other information from the cells.
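By way of a hedged illustration only, the second de-multiplexing stage (assignment by 8-mer index) might be sketched in Python as follows. The sketch assumes, hypothetically, that the 8-mer index occupies the first eight bases of each read and that exact matching against a list of known indexes is sufficient; the actual read layout (primer, index, restriction site) and any mismatch tolerance are not specified here, and the function and variable names are illustrative.

```python
BARCODE_LENGTH = 8  # the adapters in this example carry an 8-mer identification sequence


def demultiplex(reads, known_barcodes, barcode_length=BARCODE_LENGTH):
    """Split raw reads into per-barcode lists of genomic inserts.

    Assumption (not from the source): the barcode is the first
    `barcode_length` bases of the read and must match exactly.
    Returns (by_cell, unassigned).
    """
    by_cell = {bc: [] for bc in known_barcodes}
    unassigned = []
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        if barcode in by_cell:
            by_cell[barcode].append(insert)  # keep only the genomic portion
        else:
            unassigned.append(read)  # unknown or corrupted barcode
    return by_cell, unassigned


# Toy reads: two carry known barcodes, one does not.
raw_reads = ["ACGTACGTGGGCCC", "TTGCAAGCAAATTT", "NNNNNNNNCCCC"]
by_cell, unassigned = demultiplex(raw_reads, ["ACGTACGT", "TTGCAAGC"])
```

The per-barcode inserts would then be passed to an aligner; that step is outside the scope of this sketch.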

Example 3

This is an example of profiling cellular populations at the single cell level with drop-based microfluidics. Populations of cells have substantial heterogeneity that is important for their function and understanding. This variability is reflected in cell to cell variations of epigenetic features such as DNA methylation, chromatin organization, mRNA levels, and protein expression. When characterizing a pool of cells by conventional methods, these variations are quickly averaged and cannot be detected. To detect these variations, populations can be sorted by phenotype prior to characterization but ideally, cells could be characterized one-by-one. The problem of averaging over multiple cells is exacerbated when a small number of cells differ from the majority of the population. An example is the case of rare variants that are increasingly realized to underlie tumor biology and therapeutic resistance. Since the phenotype of these rare cells is yet to be discovered, presorting them is not an option. Thus, a method for characterizing multiple single cells at very high throughput is needed for understanding the behavior and function of biological systems ranging from developing blood cells to human tumors. Accordingly, this example illustrates scalable and flexible microfluidics methodology capable of profiling chromatin state and RNA expression in thousands of single cells, and thereby capturing the nature of population heterogeneity at unprecedented scale.

Characterizing the genetic and epigenetic states of single cells is a challenge because the effective concentration of the contents of a single cell is a million times smaller than that of typical samples that pool many cells, and hence the rate of reactions becomes impractically slow. In some cases, the content of the cell may be amplified prior to its characterization, as was previously used to measure single cell genetic variations or single cell RNA expression levels. However, amplifying the contents of single cells in wells is time consuming, expensive and thus not scalable to large numbers of cells; moreover, this solution is not relevant for assays involving proteins, which are denatured during amplification. Instead, this example restores the effective concentration in single cell assays by drastically decreasing the reaction volume using drop-based microfluidics.

Droplet-based microfluidics uses drops of water immersed in an inert carrier fluid as minute reaction vessels that can be precisely controlled by microfluidic devices. As an example, the droplets may be roughly 10 micrometers in diameter, each containing about 1 pL of fluid surrounded by a surfactant that both stabilizes the droplet to prevent coalescence with other droplets, and protects its interface to prevent loss of reagents through surface adsorption. The reagents within the droplets never touch the walls of the microfluidic device and fluidic control may be achieved with the inert carrier fluid, totally independent of the droplets. The droplets can be formed, refilled, thermo-cycled, merged, split, sorted, etc. at rates of up to millions per hour with exquisite control over individual droplets. Thus, droplets can be used to compartmentalize millions of single cells per hour at high concentrations, allowing measurements of millions of single cells.
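As a rough check on the droplet dimensions given above, the volume of a spherical droplet can be computed from its diameter. The short Python sketch below assumes a perfectly spherical droplet (an idealization) and uses the conversion 1 pL = 1,000 cubic micrometers; the function name is illustrative. For a diameter of 10 micrometers the result is on the order of half a picoliter, consistent in order of magnitude with the "about 1 pL" figure.

```python
import math


def droplet_volume_picoliters(diameter_um):
    """Volume of an idealized spherical droplet, in picoliters.

    diameter_um: droplet diameter in micrometers.
    """
    radius_um = diameter_um / 2.0
    volume_um3 = (4.0 / 3.0) * math.pi * radius_um ** 3
    return volume_um3 / 1000.0  # 1 pL = 1000 cubic micrometers


volume_10um = droplet_volume_picoliters(10.0)
```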

In these examples, a flexible platform for high-throughput profiling of single cells is demonstrated. These examples thus show a general method that combines droplet-based microfluidics with genomics and DNA barcoding to profile genetic and epigenetic features of single cells. To analyze diverse populations, cells are encapsulated at about one per droplet, and then each droplet is fused with another droplet containing billions of copies of a unique barcode used to tag the contents of the cell, e.g., contained within an adaptor. After tagging each cell, the droplets are merged and downstream assays can be performed on the mix of barcoded cellular information before being sequenced. Upon sequencing, the cell of origin for each fragment can be identified by reading the barcode. The platform is compatible with both DNA and RNA, uses ligation or hybridization to attach the barcodes and can be scaled up to a large number of cells.

This general method can be used, for example, to study epigenetic heterogeneity. Gene regulation in eukaryotes relies on the functional packaging of DNA into chromatin, a higher-order structure composed of DNA, RNA, histones and associated proteins. Chromatin structure and function are regulated by post-translational modifications of the histones, including acetylation, methylation and ubiquitinylation. Histone modifications (HM) can be mapped genome-wide, revealing type-specific regulation states of cells that reflect lineage-specific gene expression, developmental programs or disease processes. Given the central role of HM in stem and cancer cells, it is likely that epigenetic states differ between tumor cells and underlie their functional heterogeneity. To map histone modifications across the genome, antibodies are used to bind to specific modifications of the chromatin complex units, or nucleosomes, and then the bound DNA is sequenced in a protocol that is known as Chromatin Immuno-Precipitation sequencing (ChIP-Seq). However, mapping HM in single cells is not currently possible in other systems due to a low signal to noise ratio when performing ChIP on genomic material from fewer than 10,000 cells. These limitations can be overcome, for example, by uniquely barcoding the DNA of multiple cells and then performing ChIP-Seq on a pool rather than on a single cell. Thus, the high signal to noise ratio typical of ChIP is maintained and single cell information is restored by reading the barcodes.

To perform ChIP-Seq on single cells, cells are encapsulated one per drop together with lysis buffer and Micrococcal Nuclease enzyme (MNase) that digests inter-nucleosomal DNA. After digestion, each droplet containing fragmented nucleosomes from a single cell is merged with a drop containing billions of copies of a unique DNA adapter and a ligation buffer as shown in FIG. 4. The merging is effected by applying an electric field through an electrode positioned within the device. To allow adequate sequencing coverage for the information obtained from the barcoded cells, only 100 merged droplets are collected per ChIP-Seq experiment. To ensure that each cell is tagged with a unique barcode, the barcode library contains at least 10 times more unique barcodes than the number of cells collected. Thus when collecting 100 cells, a library containing 1152 different barcodes is used, ensuring that the probability of barcoding two different cells with the same barcode is lower than 5%. Barcoded nucleosomes can be merged with additional non-barcoded nucleosomes used as a biological buffer to ensure high signal to noise ratios during the Immuno-Precipitation step. To enrich for barcoded nucleosomal fragments, the adapters may be designed with additional DNA sequences that are used as specific priming regions for amplification and as restriction sites for selection during the preparation of the library for Illumina sequencing. Thus, although initially the barcoded fragments make up a negligible fraction of the DNA in the sample, the majority of sequenced reads are barcoded on both ends, as shown in FIG. 5A.
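The dependence of barcode collisions on library size can be explored numerically. The following Python sketch is offered only as an illustration under a stated assumption that is not taken from this example: each cell independently receives one barcode drawn uniformly at random from the library. The example does not state which collision event its probability figure refers to, so the sketch exposes three different quantities (all function names are hypothetical): the chance that two specific cells share a barcode, the chance that a given cell shares its barcode with any other cell, and the chance that any two cells in the whole experiment share a barcode.

```python
def pairwise_collision(n_barcodes):
    """Probability that two specific cells receive the same barcode."""
    return 1.0 / n_barcodes


def per_cell_collision(n_cells, n_barcodes):
    """Probability that a given cell shares its barcode with at least
    one of the other n_cells - 1 cells."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


def any_collision(n_cells, n_barcodes):
    """Probability that at least two cells anywhere in the experiment
    share a barcode (the classic 'birthday problem')."""
    p_all_unique = 1.0
    for i in range(n_cells):
        p_all_unique *= 1.0 - i / n_barcodes
    return 1.0 - p_all_unique


# Values from this example: 100 cells, 1152 barcodes.
p_pair = pairwise_collision(1152)
p_cell = per_cell_collision(100, 1152)
p_any = any_collision(100, 1152)
```

Under the uniform-random assumption these three quantities differ substantially from one another, so the choice of definition matters when sizing a barcode library.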

FIG. 4 shows microfluidics of a single cell ChIP-seq. FIG. 4A shows cells encapsulated in drops together with lysis buffer and digestive enzyme. FIG. 4B shows that, after incubation, droplets are re-injected into another microfluidic device where they are fused with drops containing barcodes. FIG. 4C is an image of single cells being encapsulated in a microfluidic droplet maker. FIG. 4D shows droplets containing barcodes being fused with drops containing cell lysate. Scale bars are 100 micrometers.

To demonstrate that the epigenetic profiles of single cells could be measured, two distinct murine cell lines, mES and mEF, were encapsulated. Each cell line was separately tagged with different barcodes and then 50 merged drops from each cell line were collected and pooled together to undergo H3K4me3 ChIP-Seq. Thus, the cell type of each barcoded fragment was known a priori and could be compared to the separation obtained from analyzing the sequenced data. Although only 50 barcodes from each cell type were expected to be found in the sequenced data, all barcodes were present when analyzing the data, as shown in FIG. 5A. This is believed to be a result of cross-contamination between droplets that may occur during droplet merging due to electro-wetting of the microfluidic channel near the electrodes that were used to merge the droplets. Thus, to analyze the cellular information, barcodes that were used to tag cells were identified as those possessing the largest number of DNA fragments in the sample.
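The barcode selection described above, in which the barcodes with the largest numbers of DNA fragments are taken to be the ones that tagged cells, can be sketched in Python as follows. This is a minimal illustration: the function name and the input format (one barcode string per sequenced fragment) are assumptions, and the number of barcodes to keep is supplied by the user rather than inferred from the data.

```python
from collections import Counter


def select_cell_barcodes(fragment_barcodes, n_cells):
    """Return the n_cells barcodes carrying the most fragments.

    fragment_barcodes: iterable with one barcode per sequenced fragment.
    Barcodes with few fragments are treated as background, e.g. from
    cross-contamination between droplets.
    """
    counts = Counter(fragment_barcodes)
    return [barcode for barcode, _ in counts.most_common(n_cells)]


# Toy data: two well-populated barcodes and two low-count background barcodes.
observed = ["BC01"] * 40 + ["BC02"] * 35 + ["BC03"] * 2 + ["BC04"] * 1
chosen = select_cell_barcodes(observed, n_cells=2)
```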

After aligning and filtering the reads, each of the chosen barcodes typically tagged 2,000-5,000 distinct DNA fragments from each single cell, representing a sparse binary vector spanning the murine genome with enrichment for positive entries in genomic regions that were wrapped around H3K4me3 marked nucleosomes. The complete set of chosen barcodes was represented as a sparse binary matrix, in which only 15,000 out of the 1 million genomic bins have reads from more than one cell and can therefore be used to compare between them. Despite the sparseness of the data, when aggregating all mES cells and all mEF cells the known profile measured in many cells in bulk was restored as shown in FIG. 5B; moreover, it was possible to separate the two types of cells from each other in an unsupervised way based on the correlations between the vectors of the different cells, as shown in FIGS. 5C and 5D. All but one barcode, each representing the data of a single cell, were successfully classified as originating from either mES or mEF cell lines, demonstrating that biologically relevant data could be measured from single cells using ChIP-Seq.

Thus, FIG. 3A shows the number of unique reads per barcode for a sample of 50 mES cells and 50 mEF cells after ChIP-Seq with H3K4me3. Lighter shades indicate reads with the same barcode on both sides, while darker shades indicate reads with non-matching barcodes on both sides. FIG. 3B shows representative reads from 8 different mES cells and the aggregated data of 50 mESs at the top. FIG. 3C is a correlation matrix between genomic bins of 114 cells (50 mES, 64 mEF). The mES were the first 50 vectors while the mEF were the last 64, and their separation into two blocks of correlation was observed. After just 3 iterations, an unsupervised algorithm based on the correlations between each cell and two aggregates of cells could separate the two populations, as is shown in FIG. 3D.
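The details of the unsupervised algorithm are not given in this example. Purely as a hypothetical sketch of one way such a procedure could work, the Python code below repeatedly (a) sums the binary vectors of the cells currently assigned to each of two groups to form two aggregates and (b) reassigns every cell to the aggregate with which its vector has the higher Pearson correlation. The seeding with two user-chosen cells, the function names, and the tie-breaking rule are all assumptions made for illustration, not features of the method described above.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / math.sqrt(vx * vy)


def aggregate(vectors, members):
    """Sum the vectors of the given member indices, bin by bin."""
    n_bins = len(vectors[0])
    return [sum(vectors[i][j] for i in members) for j in range(n_bins)]


def separate_two_populations(vectors, seed_a, seed_b, iterations=3):
    """Assign each cell to one of two groups (0 or 1).

    seed_a, seed_b: indices of two cells used to start the two
    aggregates (a hypothetical initialization).
    """
    group_a, group_b = [seed_a], [seed_b]
    labels = []
    for _ in range(iterations):
        agg_a = aggregate(vectors, group_a)
        agg_b = aggregate(vectors, group_b)
        labels = [0 if pearson(v, agg_a) >= pearson(v, agg_b) else 1
                  for v in vectors]
        group_a = [i for i, lab in enumerate(labels) if lab == 0]
        group_b = [i for i, lab in enumerate(labels) if lab == 1]
        if not group_a or not group_b:
            break  # one group emptied out; nothing further to refine
    return labels


# Toy data: cells 0-2 have reads in bins 0-2, cells 3-5 in bins 3-5.
cells = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
]
labels = separate_two_populations(cells, seed_a=0, seed_b=3)
```

In practice the vectors would be restricted to genomic bins with reads from more than one cell, since bins covered by a single cell cannot contribute to a comparison between cells.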

Example 4

In this example, mouse embryonic stem cells (mES) were compared with mouse embryonic fibroblasts (mEF), using an embodiment of the invention. As shown in FIG. 8A, mESs were encoded with identification sequences 1-576 (“barcodes”) while mEFs were encoded with identification sequences 577-1152. (See previous examples for how cells can be encapsulated in droplets with adapters containing suitable identification sequences). In these experiments, the identification sequences were arbitrarily chosen and numbered 1-1152. After ligation, the populations of droplets were combined together.

The droplets were then analyzed using ChIP-sequencing, as is shown in FIG. 8B. The H3K4me3 histone modification was studied in this example. 4.6 million cells were studied, with 1 million distinct reads after alignment and filtering. It was found that each of the relevant barcodes typically tagged 2,000-5,000 distinct DNA fragments from a single cell, representing a sparse binary vector spanning the mouse genome. Of these, about 70% contained adapters on both ends of the DNA. About 10-20% included “cross talk,” i.e., the DNA was incorrectly labeled. For each barcode, there were about 3,000-10,000 cells or “reads” that were identified. In addition, the separate reads could be pooled together to restore the epigenomic profile that was measured in a population of cells using more conventional techniques, as is shown in FIG. 8C, where aggregates of 50 single-cell profiles for both mESs and mEFs were compared to traditional protocols for detecting histones.

While several embodiments of the present invention have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present invention. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings of the present invention is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, the invention may be practiced otherwise than as specifically described and claimed. The present invention is directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the scope of the present invention.

All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.

The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”

The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with “and/or” should be construed in the same fashion, i.e., “one or more” of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to “A and/or B”, when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.

As used herein in the specification and in the claims, “or” should be understood to have the same meaning as “and/or” as defined above. For example, when separating items in a list, “or” or “and/or” shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein shall only be interpreted as indicating exclusive alternatives (i.e. “one or the other but not both”) when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.” “Consisting essentially of,” when used in the claims, shall have its ordinary meaning as used in the field of patent law.

As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, “at least one of A and B” (or, equivalently, “at least one of A or B,” or, equivalently “at least one of A and/or B”) can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.

It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited.

In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” “composed of,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedures, Section 2111.03.

What is claimed is:
1-111. (canceled)
112. A method, comprising: providing a plurality of droplets containing cells, the droplets formed of a first liquid contained within a second fluid; lysing the cells contained within the droplets to produce cell lysate within the droplets; attaching at least some of the sequences to an adapter, the adapter comprising an identification sequence and a restriction site, wherein the identification sequences allow identification of the cells within the plurality of droplets; wherein the adaptors are bound to a solid support; exposing at least some of the cell lysate contained within the droplets to a polymerase and/or a reverse transcription enzyme to produce a plurality of sequences within the droplets; sequencing some of the sequences.
113. The method of claim 112, wherein the adapters are released from the solid support.
114. The method of claim 112, wherein the plurality of sequences comprise DNA sequences.
115. The method of claim 112, wherein the plurality of sequences comprise RNA sequences.
116. The method of claim 113, wherein the adapters are released from the solid support by optical, chemical or enzymatic techniques.
117. The method of claim 112, wherein a plurality of adapters are used, comprising a plurality of different identification sequences each having the same length and a substantially identical restriction site.
118. The method of claim 112, wherein sequencing the sequences comprises performing chromatin immunoprecipitation (ChIP) sequencing on the sequences.
119. The method of claim 112, wherein at least some of the cells arise from tissue.
120. The method of claim 112, wherein sequences originating from the same cell contain identical identification sequences and sequences originating from different cells contain different identification sequences.
121. The method of claim 112, wherein at least some of the sequences are ligated to an adapter.
122. The method of claim 112, wherein the act of ligating comprises fusing the droplets containing the cell lysates to adapter droplets containing the solid phase, ligase and the adapters.
123. A method, comprising: providing a plurality of droplets containing cells, the droplets formed of a first liquid contained within a second fluid; lysing the cells contained within the droplets to produce cell lysate within the droplets; attaching at least some of the sequences to an adapter, the adapter comprising an identification sequence and a restriction site, wherein the identification sequences allow identification of the cells within the plurality of droplets; wherein the adaptors are bound to a solid support; sequencing some of the sequences.
124. The method of claim 123, wherein the adapters are released from the solid support.
125. The method of claim 124, wherein the adapters are released from the solid support by optical, chemical or enzymatic techniques.
126. The method of claim 123, wherein the act of ligating comprises fusing the droplets containing the cell lysates to adapter droplets containing the solid phase, ligase and the adapters.
127. A method, comprising: providing a plurality of droplets containing cells, the droplets formed of a first liquid contained within a second fluid; lysing the cells contained within the droplets to produce cell lysate within the droplets; exposing at least some of the cell lysate contained within the droplets to a nuclease to produce a plurality of sequences within the droplets; attaching at least some of the sequences to an adapter, the adapter comprising an identification sequence and a restriction site, wherein the identification sequences allow identification of the cells within the plurality of droplets; wherein the adaptors are bound to a solid support; sequencing some of the sequences.
128. The method of claim 127, wherein the adapters are released from the solid support.
129. The method of claim 128, wherein the adapters are released from the solid support by optical, chemical or enzymatic techniques.
130. The method of claim 127, wherein the act of ligating comprises fusing the droplets containing the cell lysates to adapter droplets containing the solid phase, ligase and the adapters.